{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "whjsJasuhstV"
   },
   "source": [
    "<a href=\"https://colab.research.google.com/github/jeffheaton/app_generative_ai/blob/main/t81_559_class_03_1_llm.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "euOZxlIMhstX"
   },
   "source": [
    "# T81-559: Applications of Generative Artificial Intelligence\n",
    "**Module 3: Large Language Models**\n",
    "* Instructor: [Jeff Heaton](https://sites.wustl.edu/jeffheaton/), McKelvey School of Engineering, [Washington University in St. Louis](https://engineering.wustl.edu/Programs/Pages/default.aspx)\n",
    "* For more information visit the [class website](https://sites.wustl.edu/jeffheaton/t81-558/)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "d4Yov72PhstY"
   },
   "source": [
    "# Module 3 Material\n",
    "\n",
    "* Part 3.1: Foundation Models [[Video]](https://www.youtube.com/watch?v=Gb0tk5qq1fA) [[Notebook]](t81_559_class_03_1_llm.ipynb)\n",
    "* Part 3.2: Text Generation [[Video]](https://www.youtube.com/watch?v=lB97Lqt7q58) [[Notebook]](t81_559_class_03_2_text_gen.ipynb)\n",
    "* Part 3.3: Text Summarization [[Video]](https://www.youtube.com/watch?v=3MoIUXE2eEU) [[Notebook]](t81_559_class_03_3_text_summary.ipynb)\n",
    "* **Part 3.4: Text Classification** [[Video]](https://www.youtube.com/watch?v=2VpOwFIGmA8) [[Notebook]](t81_559_class_03_4_classification.ipynb)\n",
    "* Part 3.5: LLM Writes a Book [[Video]](https://www.youtube.com/watch?v=iU40Rttlb_Q) [[Notebook]](t81_559_class_03_5_book.ipynb)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "AcAUP0c3hstY"
   },
   "source": [
    "# Google CoLab Instructions\n",
    "\n",
    "The following code ensures that Google CoLab is running and maps Google Drive if needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "xsI496h5hstZ",
    "outputId": "b4635734-c5af-48b9-a073-34cd3444471f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Note: using Google CoLab\n",
      "Requirement already satisfied: langchain in /usr/local/lib/python3.10/dist-packages (0.2.0)\n",
      "Requirement already satisfied: langchain_openai in /usr/local/lib/python3.10/dist-packages (0.1.7)\n",
      "Requirement already satisfied: PyYAML>=5.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (6.0.1)\n",
      "Requirement already satisfied: SQLAlchemy<3,>=1.4 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.0.30)\n",
      "Requirement already satisfied: aiohttp<4.0.0,>=3.8.3 in /usr/local/lib/python3.10/dist-packages (from langchain) (3.9.5)\n",
      "Requirement already satisfied: async-timeout<5.0.0,>=4.0.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (4.0.3)\n",
      "Requirement already satisfied: dataclasses-json<0.7,>=0.5.7 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.6.6)\n",
      "Requirement already satisfied: langchain-core<0.3.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.0)\n",
      "Requirement already satisfied: langchain-text-splitters<0.3.0,>=0.2.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.2.0)\n",
      "Requirement already satisfied: langsmith<0.2.0,>=0.1.17 in /usr/local/lib/python3.10/dist-packages (from langchain) (0.1.59)\n",
      "Requirement already satisfied: numpy<2,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (1.25.2)\n",
      "Requirement already satisfied: pydantic<3,>=1 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.7.1)\n",
      "Requirement already satisfied: requests<3,>=2 in /usr/local/lib/python3.10/dist-packages (from langchain) (2.31.0)\n",
      "Requirement already satisfied: tenacity<9.0.0,>=8.1.0 in /usr/local/lib/python3.10/dist-packages (from langchain) (8.3.0)\n",
      "Requirement already satisfied: openai<2.0.0,>=1.24.0 in /usr/local/lib/python3.10/dist-packages (from langchain_openai) (1.30.1)\n",
      "Requirement already satisfied: tiktoken<1,>=0.7 in /usr/local/lib/python3.10/dist-packages (from langchain_openai) (0.7.0)\n",
      "Requirement already satisfied: aiosignal>=1.1.2 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.3.1)\n",
      "Requirement already satisfied: attrs>=17.3.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (23.2.0)\n",
      "Requirement already satisfied: frozenlist>=1.1.1 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.4.1)\n",
      "Requirement already satisfied: multidict<7.0,>=4.5 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (6.0.5)\n",
      "Requirement already satisfied: yarl<2.0,>=1.0 in /usr/local/lib/python3.10/dist-packages (from aiohttp<4.0.0,>=3.8.3->langchain) (1.9.4)\n",
      "Requirement already satisfied: marshmallow<4.0.0,>=3.18.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (3.21.2)\n",
      "Requirement already satisfied: typing-inspect<1,>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from dataclasses-json<0.7,>=0.5.7->langchain) (0.9.0)\n",
      "Requirement already satisfied: jsonpatch<2.0,>=1.33 in /usr/local/lib/python3.10/dist-packages (from langchain-core<0.3.0,>=0.2.0->langchain) (1.33)\n",
      "Requirement already satisfied: packaging<24.0,>=23.2 in /usr/local/lib/python3.10/dist-packages (from langchain-core<0.3.0,>=0.2.0->langchain) (23.2)\n",
      "Requirement already satisfied: orjson<4.0.0,>=3.9.14 in /usr/local/lib/python3.10/dist-packages (from langsmith<0.2.0,>=0.1.17->langchain) (3.10.3)\n",
      "Requirement already satisfied: anyio<5,>=3.5.0 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (3.7.1)\n",
      "Requirement already satisfied: distro<2,>=1.7.0 in /usr/lib/python3/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (1.7.0)\n",
      "Requirement already satisfied: httpx<1,>=0.23.0 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (0.27.0)\n",
      "Requirement already satisfied: sniffio in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (1.3.1)\n",
      "Requirement already satisfied: tqdm>4 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (4.66.4)\n",
      "Requirement already satisfied: typing-extensions<5,>=4.7 in /usr/local/lib/python3.10/dist-packages (from openai<2.0.0,>=1.24.0->langchain_openai) (4.11.0)\n",
      "Requirement already satisfied: annotated-types>=0.4.0 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1->langchain) (0.6.0)\n",
      "Requirement already satisfied: pydantic-core==2.18.2 in /usr/local/lib/python3.10/dist-packages (from pydantic<3,>=1->langchain) (2.18.2)\n",
      "Requirement already satisfied: charset-normalizer<4,>=2 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.3.2)\n",
      "Requirement already satisfied: idna<4,>=2.5 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (3.7)\n",
      "Requirement already satisfied: urllib3<3,>=1.21.1 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2.0.7)\n",
      "Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.10/dist-packages (from requests<3,>=2->langchain) (2024.2.2)\n",
      "Requirement already satisfied: greenlet!=0.4.17 in /usr/local/lib/python3.10/dist-packages (from SQLAlchemy<3,>=1.4->langchain) (3.0.3)\n",
      "Requirement already satisfied: regex>=2022.1.18 in /usr/local/lib/python3.10/dist-packages (from tiktoken<1,>=0.7->langchain_openai) (2023.12.25)\n",
      "Requirement already satisfied: exceptiongroup in /usr/local/lib/python3.10/dist-packages (from anyio<5,>=3.5.0->openai<2.0.0,>=1.24.0->langchain_openai) (1.2.1)\n",
      "Requirement already satisfied: httpcore==1.* in /usr/local/lib/python3.10/dist-packages (from httpx<1,>=0.23.0->openai<2.0.0,>=1.24.0->langchain_openai) (1.0.5)\n",
      "Requirement already satisfied: h11<0.15,>=0.13 in /usr/local/lib/python3.10/dist-packages (from httpcore==1.*->httpx<1,>=0.23.0->openai<2.0.0,>=1.24.0->langchain_openai) (0.14.0)\n",
      "Requirement already satisfied: jsonpointer>=1.9 in /usr/local/lib/python3.10/dist-packages (from jsonpatch<2.0,>=1.33->langchain-core<0.3.0,>=0.2.0->langchain) (2.4)\n",
      "Requirement already satisfied: mypy-extensions>=0.3.0 in /usr/local/lib/python3.10/dist-packages (from typing-inspect<1,>=0.4.0->dataclasses-json<0.7,>=0.5.7->langchain) (1.0.0)\n"
     ]
    }
   ],
   "source": [
    "import os\n",
    "\n",
    "try:\n",
    "    from google.colab import drive, userdata\n",
    "    COLAB = True\n",
    "    print(\"Note: using Google CoLab\")\n",
    "except:\n",
    "    print(\"Note: not using Google CoLab\")\n",
    "    COLAB = False\n",
    "\n",
    "# OpenAI Secrets\n",
    "\n",
    "# OpenAI Secrets\n",
    "if COLAB:\n",
    "    os.environ[\"OPENAI_API_KEY\"] = userdata.get('OPENAI_API_KEY')\n",
    "\n",
    "# Install needed libraries in CoLab\n",
    "if COLAB:\n",
    "    !pip install langchain langchain_openai"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "pC9A-LaYhsta"
   },
   "source": [
    "# 3.4: Text Classification\n",
    "\n",
    "\n",
    "Large Language Models (LLMs) have revolutionized the field of text classification by offering a more flexible and efficient approach compared to traditional machine learning methods. Traditionally, text classification required extensive labeled datasets for training machine learning models. This process was time-consuming and resource-intensive, as it involved manually annotating a large number of examples for each category.\n",
    "\n",
    "In contrast, LLMs, such as those based on the GPT architecture, can perform text classification using a technique known as \"zero-shot\" learning. This innovative method allows the model to classify text without needing any labeled examples beforehand. Instead of relying on a dataset of annotated examples, we provide the LLM with a natural language description of the classification task. The model then uses its vast knowledge, acquired from training on diverse and extensive text data, to understand and perform the classification based on the given description.\n",
    "\n",
    "This zero-shot capability significantly reduces the time and effort required to set up a text classification system. It also enables rapid adaptation to new and evolving classification tasks, making LLMs a powerful tool for various applications, from sentiment analysis and topic categorization to spam detection and content moderation.\n",
    "\n",
    "The following code gets sample emails that a professor might get from faculty, students and family members. We will write code to classify these emails.\n",
    "\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "id": "jzYI0GzNqcEq"
   },
   "outputs": [],
   "source": [
    "import requests\n",
    "\n",
    "def get_email(i):\n",
    "    if i<1 or i>10:\n",
    "        raise Exception(\"Invalid email number\")\n",
    "\n",
    "    # URL to download\n",
    "    url = f\"https://data.heatonresearch.com/wustl/CABI/genai-langchain/emails/email_{i}.txt\"\n",
    "\n",
    "    # Perform a GET request to the URL\n",
    "    response = requests.get(url)\n",
    "\n",
    "    # Check if the request was successful\n",
    "    if response.status_code == 200:\n",
    "        # Convert the content of the response to a string\n",
    "        content = response.text\n",
    "        return content\n",
    "    else:\n",
    "        raise Exception(\"Failed to retrieve the content\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IO7NnorPq18c"
   },
   "source": [
    "For example, the following displays email \\#1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "RJT30deCqhjs",
    "outputId": "640dfae5-ac8b-48fe-ca9d-ab8f65401717"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Dear Professor Lawson,\n",
      "\n",
      "I'm Alex Chen, leader of team Nova in the Data Science Challenge (Spring 2019).\n",
      "\n",
      "I am seeking to pursue a PhD in Computer Science, and I am hoping you could provide a recommendation letter for my applications.\n",
      "\n",
      "This term has been unexpectedly demanding, leading to delays in preparing my applications. I acknowledge the timing is not ideal, but I am committed to completing my applications diligently.\n",
      "\n",
      "Your guidance in CSC 402 Advanced Machine Learning and your support in my research project have been invaluable. I was particularly drawn to this field, prompting me to enroll in your course despite it not contributing to my major credits. The course proved to be highly beneficial, enriching my understanding through online lectures and practical assignments. I excelled in the course, securing an A+ and leading my team to the top position in the semester's Data Science Challenge.\n",
      "\n",
      "In the previous term (Fall 2019), I encountered a challenging issue in my project on \"Automated Segmentation of Cardiac MRI Images\". After discussing this in a campus session, you introduced me to a seminal paper on convolutional networks, which greatly influenced my approach to developing an image segmentation model using advanced neural network techniques.\n",
      "\n",
      "Looking ahead, I aim to delve deeper into Machine Learning and computational models. The knowledge acquired from your course is sure to be a significant advantage in my future studies.\n",
      "\n",
      "I have attached my most recent CV and academic transcript for your review. I am applying to approximately 10 universities, with the earliest deadline on January 5th. I would appreciate the opportunity to discuss this further. Thank you very much, and I hope you have a wonderful holiday season!\n",
      "\n",
      "Best regards,\n",
      "Alex Chen\n"
     ]
    }
   ],
   "source": [
    "print(get_email(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "8cdf6QNZrC02"
   },
   "source": [
    "Now create a program that loops over all 10 and classifies each email as one of the following.\n",
    "\n",
    "* spam - For marketing emails trying to sell something\n",
    "* faculty - For faculty annoucements and requests\n",
    "* help - For students requesting help on an assignment\n",
    "* lor - For students requesting a letter of recommendation\n",
    "\n",
    "If the email fits into none of these, then classify it as \"other\". If the email is a request for help on an assignment, determine the assignment number the student is asking about.\n",
    "\n",
    "The following is sample is how we will output.\n",
    "\n",
    "```\n",
    "Email #1 is: spam\n",
    "Email #2 is: other\n",
    "Email #3 is: ...\n",
    "\n",
    "```\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "id": "jG0ra9gvs1vO"
   },
   "outputs": [],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "MODEL = 'gpt-4o'\n",
    "TEMPERATURE = 0.0\n",
    "\n",
    "# Initialize the OpenAI LLM with your API key\n",
    "llm = ChatOpenAI(\n",
    "    model=MODEL,\n",
    "    temperature=TEMPERATURE,\n",
    "    n=1\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4M4AUGRRs94D"
   },
   "source": [
    "This code demonstrates the use of a large language model (LLM) for classifying and extracting information from emails using the Langchain framework. The process starts by importing necessary classes and modules from Langchain, including HumanMessage, SystemMessage, AIMessage, ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate, PromptTemplate, LLMChain, and SimpleSequentialChain.\n",
    "\n",
    "Two prompt templates are defined using PromptTemplate. The first template, email_prompt, takes an email as input and asks the model to classify it into one of five categories: spam, faculty, help, lor (letter of recommendation), or other. The second template, help_prompt, is designed to extract the assignment number from emails classified as \"help,\" where students are requesting assistance with an assignment.\n",
    "\n",
    "The code then creates two LLM chains, chain_email and chain_help, by combining the respective prompts with the LLM. These chains are used to process the emails.\n",
    "\n",
    "The main loop iterates over a range of 10 emails (from 1 to 10). For each email, it retrieves the email content using the get_email(i) function. The email content is then classified using chain_email.invoke(email), which returns the classification as a string. If the classification is \"help,\" the email is further processed with chain_help.invoke(email) to extract the assignment number. The extracted information is printed accordingly: either indicating the assignment number for help-related emails or the classification for other types of emails."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zc92unRdrDQC",
    "outputId": "12f98c51-1c94-43aa-ac74-59583f1241d1"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Email #1 is: lor\n",
      "Email #2 is a question about assignment #7\n",
      "Email #3 is: faculty\n",
      "Email #4 is: lor\n",
      "Email #5 is: spam\n",
      "Email #6 is: other\n",
      "Email #7 is a question about assignment #3\n",
      "Email #8 is: faculty\n",
      "Email #9 is: spam\n",
      "Email #10 is: other\n"
     ]
    }
   ],
   "source": [
    "from langchain_core.messages import HumanMessage, SystemMessage, AIMessage\n",
    "from langchain_core.prompts.chat import (\n",
    "    ChatPromptTemplate,\n",
    "    HumanMessagePromptTemplate,\n",
    "    SystemMessagePromptTemplate,\n",
    ")\n",
    "from langchain.prompts import PromptTemplate\n",
    "from langchain.chains import LLMChain, SimpleSequentialChain\n",
    "\n",
    "email_prompt = PromptTemplate( input_variables = ['email'], template = \"\"\"\n",
    "Classify the following email as either:\n",
    "* spam - For marketing emails trying to sell something\n",
    "* faculty - For faculty annoucements and requests\n",
    "* help - For students requesting help on an assignment\n",
    "* lor - For students requesting a letter of recommendation\n",
    "* other - If it does not fit into any of these.\n",
    "Here is the email. Return code, such as spam. Return nothing else, do not explain your choice.\n",
    "Make sure that if the email does not fit into one of the categories that you classify it as other.\n",
    "Here is the email:\n",
    "\n",
    "{email}\"\"\")\n",
    "\n",
    "help_prompt = PromptTemplate( input_variables = ['email'], template = \"\"\"\n",
    "You are given an email where a student is asking about an assignment. Return the assignment number\n",
    "that they are asking about. If you cannot tell return a ?. Return only the assignment number as\n",
    "an integer, do not explain.\n",
    "Here is the email:\n",
    "\n",
    "{email}\"\"\")\n",
    "\n",
    "chain_email = email_prompt | llm\n",
    "chain_help = help_prompt | llm\n",
    "\n",
    "for i in range(1,11):\n",
    "    email = get_email(i)\n",
    "    classification = chain_email.invoke(email).content.strip()\n",
    "    if classification == 'help':\n",
    "        assignment = chain_help.invoke(email).content.strip()\n",
    "        print(f\"Email #{i} is a question about assignment #{assignment}\")\n",
    "    else:\n",
    "        print(f\"Email #{i} is: {classification}\")\n",
    "\n"
   ]
  },
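  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The loop above compares the model's reply to `help` using exact string equality, so a reply such as `Help.` would fall through to the else branch. The following is a minimal sketch of one way to normalize the reply before comparing it. The names `VALID_LABELS` and `normalize_label` are hypothetical and not part of the original notebook, and falling back to `other` for unrecognized replies is an assumption.\n",
    "\n",
    "```python\n",
    "# Hypothetical helper, not part of the original notebook.\n",
    "VALID_LABELS = {'spam', 'faculty', 'help', 'lor', 'other'}\n",
    "\n",
    "def normalize_label(raw):\n",
    "    # Trim whitespace, then periods, asterisks, backticks, and double quotes,\n",
    "    # and lowercase the result.\n",
    "    label = raw.strip().strip('.*`\"').lower()\n",
    "    # Assumption: any reply outside the known labels is treated as 'other'.\n",
    "    return label if label in VALID_LABELS else 'other'\n",
    "\n",
    "print(normalize_label(' Help. '))  # help\n",
    "print(normalize_label('unsure'))   # other\n",
    "```\n",
    "\n",
    "In the loop, this could be applied as `classification = normalize_label(chain_email.invoke(email).content)`."
   ]
  },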
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ibm452B-uURM"
   },
   "source": [
    "## Classify Names\n",
    "\n",
    "Spell checkers can be tricky to code. With the large diversity of names around the world, sometimes people's names are flagged as typos. There is a campaign named [#iamnotatypo](https://www.iamnotatypo.org/) that attempts to build a list of people's names; however, maybe AI can help. As a human, usually I can determine if words are a name just from context. Consider the following paragraph where many of the names would be flagged by MS-Word as typos. Can GenAI find all the names in this paragraph? Just to make it more like text a human would generate, I \"forgot\" to capitalize a few of the names.\n",
    "\n",
    "Make use of LangChain to extract just the names, as a Python list."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "id": "MlC_0wQ3vEzn"
   },
   "outputs": [],
   "source": [
    "SAMPLE = \"\"\"\n",
    "Esmae and Zarah have been friends since college. After graduation they interviewed with priti\n",
    "for a job at the technotryne corporation, whose flagship product is named Futuricore, which\n",
    "was developed by a programmer named María José García Martínez. After a successful interview\n",
    "they met with their friends Rafe and Ayda to celebrate good times. Their first day on the job,\n",
    "they met three other employees, Ruaridh, Eesa, and Otillie. Later they were joined by\n",
    "Matei Smith.\"\"\""
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "taGW_e6xvKJM"
   },
   "source": [
    "Now we define an LLM to parse through the above paragraph."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "id": "Crvlr5aFuWks"
   },
   "outputs": [],
   "source": [
    "from langchain_openai import ChatOpenAI\n",
    "\n",
    "MODEL = 'gpt-4o-mini'\n",
    "TEMPERATURE = 0.0\n",
    "\n",
    "# Initialize the OpenAI LLM with your API key\n",
    "llm = ChatOpenAI(\n",
    "    model=MODEL,\n",
    "    temperature=TEMPERATURE,\n",
    "    n=1\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4r8S8zI0ua8U"
   },
   "source": [
    "Finally, we use a one-shot prompt to classify and extract these names."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "id": "zeEyWY1TuZrs",
    "outputId": "b97fd660-f68e-4b9c-af2c-167d5b83db9f"
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Esmae\n",
      "Zarah\n",
      "Priti\n",
      "María José García Martínez\n",
      "Rafe\n",
      "Ayda\n",
      "Ruaridh\n",
      "Eesa\n",
      "Otillie\n",
      "Matei Smith\n"
     ]
    }
   ],
   "source": [
    "from langchain_core.prompts import PromptTemplate\n",
    "\n",
    "prompt = PromptTemplate(\n",
    "    template=\"\"\"\n",
    "    Find all human names in the following text. Extract complete name, return only names.\\n\n",
    "    Output each name on a separate line.\\n\n",
    "    {subject}.\\n\"\"\",\n",
    "    input_variables=[\"subject\"]\n",
    ")\n",
    "\n",
    "chain = prompt | llm\n",
    "\n",
    "print(chain.invoke({\"subject\": SAMPLE}).content)"
   ]
  }
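  ,
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The cell above prints the model's reply as plain text, one name per line. A minimal sketch of turning such a reply into a Python list follows. The helper `names_to_list` is hypothetical, and the sample reply is hard-coded here rather than taken from a live model call.\n",
    "\n",
    "```python\n",
    "# Hypothetical helper, not part of the original notebook.\n",
    "def names_to_list(reply):\n",
    "    # Keep one entry per non-empty line, trimming whitespace and list bullets.\n",
    "    names = []\n",
    "    for line in reply.splitlines():\n",
    "        name = line.strip().lstrip('-*').strip()\n",
    "        if name:\n",
    "            names.append(name)\n",
    "    return names\n",
    "\n",
    "# Hard-coded stand-in for a model reply.\n",
    "reply = 'Esmae\\nZarah\\n\\n- Priti\\n'\n",
    "print(names_to_list(reply))  # ['Esmae', 'Zarah', 'Priti']\n",
    "```\n",
    "\n",
    "With the chain above, this could be called as `names_to_list(chain.invoke({'subject': SAMPLE}).content)`."
   ]
  }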
 ],
 "metadata": {
  "anaconda-cloud": {},
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "display_name": "Python 3.11 (genai)",
   "language": "python",
   "name": "genai"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.11.8"
  },
  "varInspector": {
   "cols": {
    "lenName": 16,
    "lenType": 16,
    "lenVar": 40
   },
   "kernels_config": {
    "python": {
     "delete_cmd_postfix": "",
     "delete_cmd_prefix": "del ",
     "library": "var_list.py",
     "varRefreshCmd": "print(var_dic_list())"
    },
    "r": {
     "delete_cmd_postfix": ") ",
     "delete_cmd_prefix": "rm(",
     "library": "var_list.r",
     "varRefreshCmd": "cat(var_dic_list()) "
    }
   },
   "types_to_exclude": [
    "module",
    "function",
    "builtin_function_or_method",
    "instance",
    "_Feature"
   ],
   "window_display": false
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
